Version 1.7 - January 1998
If you don't know any, ask your local post office and find out how to make a payment to UNICEF.
The amount of the donation is up to you, but please do it!
Before running this program on your computer, please read the following paragraph carefully, and continue only if you agree with the terms written below.
THE HTTX AUTHOR IS IN NO WAY RESPONSIBLE FOR MORAL AND/OR MATERIAL DAMAGES THAT HIS PROGRAM MAY CAUSE TO PEOPLE OR THINGS. THE PROGRAMMER GAVE HIS BEST TO LIMIT THE PROBLEMS THAT HTTX MAY CAUSE, BUT HE IS NOT ABLE TO GUARANTEE ITS EFFICIENCY IN ALL THE SITUATIONS. USING HTTX, YOU, THE USER, ARE RESPONSIBLE FOR ALL MORAL, MATERIAL, CIVIL AND PENAL THINGS.
WARNING:
Many HTML documents are under copyright and may not be freely distributed, even if converted to plain text format. The author declines all responsibility for the use of files generated with HTTX.
All of the programs mentioned in this document are properties of their respective owners.
The executable program, the source code and the ideas that are its basis are the EXCLUSIVE PROPERTY of Gabriele Favrin. All rights reserved.
HTTX is freeware, NOT public domain. It may be distributed only if the executable files and the documentation remain unchanged. Distribution of the files in archive formats other than LhA is permitted, but compressing the individual files using PowerPacker or similar tools is not.
Including HTTX or parts of it on magazine cover disks is permitted only with the author's authorization.
Commercial utilization of this package is exclusively granted to AmiTrix and Yvon Rozijn (AWeb).
Aminet, Fred Fish, Meeting Pearls, Amy Resource and CU Amiga Magazine staffs are authorized to include HTTX in their public domain software collections.
HTTX (HTml > TXt) is a program to convert files from HTML format, used for viewing documents on the World Wide Web, to pure ASCII. There are similar products, but since none completely satisfied my needs, I started to write one myself.
I don't claim this is the best or the fastest one, but it does have some features not found so far in similar Amiga programs.
Copy HTTX to your C: directory or to a directory in your current path.
HTTX MUST be launched from the Shell.
Command syntax:
HTTX InputFile [OutputFile] [options]
The parameters in square brackets are optional. You are only required to specify a valid HTML file ("InputFile").
If there is no OutputFile specified, it defaults to 'InputFile'.txt (e.g. "test.html" will be saved as "test.html.txt"). If a path is specified for OutputFile, that file will be saved to that path.
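The naming rule can be sketched in a few lines of Python. This is only an illustration, not HTTX's own code: the function name `default_output` is hypothetical, and it assumes that an OutputFile ending in "/" or ":" is meant as a directory (how HTTX itself recognises a directory is not documented here).

```python
def default_output(input_file, output_file=None):
    """Return the path the converted text would be saved to (sketch)."""
    # Amiga paths use ":" after the volume name and "/" between directories.
    cut = max(input_file.rfind("/"), input_file.rfind(":"))
    name = input_file[cut + 1:]
    if output_file is None:
        # No OutputFile: append ".txt" to the full input path.
        return input_file + ".txt"
    if output_file.endswith(("/", ":")):
        # Assumption: a trailing "/" or ":" marks a directory.
        return output_file + name + ".txt"
    return output_file


print(default_output("data:txt/html/aboxe.html", "data:txt/"))
```

Called with only an input file, the sketch appends ".txt" to it; called with a complete output name, it returns that name unchanged.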
Examples:
HTTX data:txt/html/aboxe.html
The file "aboxe.html" will be converted and saved as "data:txt/html/aboxe.html.txt"
HTTX data:txt/html/aboxe.html ram:aboxe.txt
The file "aboxe.html" will be converted and saved as "ram:aboxe.txt"
HTTX data:txt/html/aboxe.html data:txt/
The file "aboxe.html" will be converted and saved as "data:txt/aboxe.html.txt"
HTTX offers many options to control the conversion process.
Default: 77 - Minimum: 15 - Maximum: 255
Default: 3 - Minimum: 1 - Maximum: (LEN value - 10) / 3
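The two ranges above are linked: the upper limit of the second value depends on LEN. A small Python sketch of that arithmetic follows. The function names are hypothetical, and integer division is an assumption (the text only gives "(LEN value - 10) / 3").

```python
LEN_DEFAULT, LEN_MIN, LEN_MAX = 77, 15, 255


def check_len(value):
    """Reject a LEN value outside the documented 15..255 range."""
    if not LEN_MIN <= value <= LEN_MAX:
        raise ValueError(f"LEN must be between {LEN_MIN} and {LEN_MAX}")
    return value


def max_dependent(len_value):
    """Upper limit of the LEN-dependent option: (LEN - 10) / 3.

    Assumption: the result is rounded down to a whole number.
    """
    return (check_len(len_value) - 10) // 3


print(max_dependent(LEN_DEFAULT))
```

With rounding down, the smallest LEN (15) gives a maximum of 1, which matches the documented minimum of 1.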
Not to be used if the converted text will go on message areas, like Fidonet or Usenet newsgroups.
Please read the "ANSI conversion" section for important information about ANSI sequences and general compatibility issues.
Default: OFF (styles are not converted).
IMPORTANT: the ANSI option adds Escape codes (ASCII 27), which are forbidden on FidoNet and strongly discouraged for any non-personal use (broadcast) of the converted text.
Default: OFF (8-bit chars are not converted).
Default: HRMODE=1 (lines are inserted using the minus "-" character).
Examples:
If the NOALIGN option is ON, both of the lines above will start at the left margin, which saves characters.
Default: OFF (alignment is converted correctly).
Default: OFF (the title is not saved as the comment, but it can still appear inside the output file).
Example:
HTTX ram:children.html SITE=http://www.unicef.org
will start the file with "URL : http://www.unicef.org"
Note: SITE has priority over GETNOTE, so specifying a site this way will override that option if it is active.
Default: OFF (without this option the URL will not be added).
Default: OFF.
Default: OFF (HTTX version, optional title and URL may be added).
Default: OFF (links aren't added).
Default: OFF (ALT-text isn't added).
Please read the "notes about conversion of <PRE>, <XMP>, <LISTING> and <SCRIPT> contents" section for important information about conversion of this type of text.
Default: OFF (<SCRIPT> content is skipped).
Default: OFF (HTTX uses standard DTD rules).
USE IT AT YOUR OWN RISK: conversion of text or binary files may cause unpredictable results.
Normally, HTTX considers a file valid HTML when:
This option must be specified if the three above conditions are false, even if the file IS an HTML document.
Default: OFF (automatic check of file).
Default: OFF (converted file is saved to disk).
The printer.device will convert standard ANSI codes and end-of-lines to the ones used by the Printer set in your Preferences.
This option should be used if you want to print the converted document, especially if the ANSI option is enabled, because ANSI codes used for conversion are different from the printer ones, which are more generic.
Older versions of HTTX used a solution like "HTTX aboxe.html prt:", which should now be avoided.
This option automatically enables QUIET, and turns off FILENOTE and STDIO options.
Default: OFF (document is displayed on screen or saved to a file).
If APPEND is ON, the converted text will be added to the end of the specified file.
Default: OFF (overwrite output file if it already exists).
ENV:httx.prefs (if another is not specified with the CFG option). If this option is ON, HTTX uses the default values for the options or the parameters specified in the command. For more information see the "external configuration" section.
Default: OFF (HTTX searches for its configuration).
ENV: directory.
This option turns NOCFG OFF.
For more information see the "external configuration" section.
Default: OFF (HTTX loads the httx.prefs configuration file).
The text file is NOT ALTERED IN ANY WAY: no 8-bit character conversion, wordwrap, ANSI codes and so on. HTTX will not warn if 8-bit characters are included.
Remember this, especially if the converted text will be posted to message areas, like Fidonet or Usenet newsgroups, where 8-bit chars are not allowed.
Default: OFF (no text file is included in the output file).
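Since HTTX gives no warning here, a text file can be checked for 8-bit characters before it is included. The following Python sketch is not part of HTTX; the function name `find_8bit` is hypothetical.

```python
def find_8bit(data):
    """Return the offsets of all bytes above 127 (8-bit characters)."""
    return [offset for offset, byte in enumerate(data) if byte > 127]


# "caff" followed by an accented e in ISO Latin 1 (byte 0xE8).
sample = b"caff\xe8"
print(find_8bit(sample))
```

An empty list means the text is pure 7-bit ASCII and safe for message areas that do not allow 8-bit chars.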
WARNING: if active, this option also hides error messages, but the AmigaDOS error codes are always returned.
Default: OFF (HTTX output is displayed).
If not specified, HTTX uses the default settings.
When conversion is finished, if QUIET option is OFF, HTTX will show:
HTTX supports an external configuration: a text file that includes the most used options, so they do not need to be typed every time you use HTTX.
By default, HTTX looks for the file "ENV:httx.prefs" (unless the NOCFG option is set, or the CFG option names a different file). It is possible to create multiple configurations, for instance one for file conversion and another for printing: create different configuration files and pass the CFG option with the name of the file (do not specify the path, it is always "ENV:"). Example:
HTTX aboxe.html
Converts "aboxe.html" using the default configuration (ENV:httx.prefs).
HTTX aboxe.html PRINT CFG=httxprt.prefs
Converts "aboxe.html" using the configuration file ENV:httxprt.prefs
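The choice of configuration file can be summarised in a short Python sketch. It is only an illustration under stated assumptions: the function name `config_path` is hypothetical, and it assumes that CFG takes priority over NOCFG, as the remark "This option turns NOCFG OFF" suggests.

```python
def config_path(cfg=None, nocfg=False):
    """Return the configuration file to load, or None for no file (sketch)."""
    if cfg:
        # The path is never given with CFG: it is always "ENV:".
        # Assumption: CFG switches NOCFG off, so it is checked first.
        return "ENV:" + cfg
    if nocfg:
        return None
    return "ENV:httx.prefs"


print(config_path(cfg="httxprt.prefs"))
```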
The file must contain ONLY the options and their possible parameters. It's allowed to put each option on a separate line for better readability.
Available options (for description see "Command line parameters" section):
Parameters specified on the command line are applied after the parameters specified in the configuration file. This can override (or toggle twice) one or more options.
Examples:
If a configuration file has the following line:
IMG GETNOTE LEN=70
and on command line you write:
HTTX aboxe.html IMG
the result is that IMG is turned on because it is present in the configuration file, but turned off again because it is also present on the command line.
HTTX aboxe.html LEN=74
LEN is present both in the configuration file and on the command line, but the command line value overrides the one from the file. LEN is now set to 74.
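The toggle and override behaviour described in these examples can be modelled in Python. This is a simplified sketch, not HTTX's parser: the function name `apply_options` is hypothetical, every switch is assumed to start OFF, and the input file name is left out of the command line string.

```python
def apply_options(config_line, command_line):
    """Apply configuration options first, then command line options."""
    state = {}
    for token in config_line.split() + command_line.split():
        if "=" in token:
            # KEY=value options: the latest value wins.
            key, value = token.split("=", 1)
            state[key] = value
        else:
            # Switches toggle: ON the first time, OFF again the second time.
            state[token] = not state.get(token, False)
    return state


print(apply_options("IMG GETNOTE LEN=70", "IMG"))
print(apply_options("IMG GETNOTE LEN=70", "LEN=74"))
```

In the first call IMG ends up OFF (toggled twice); in the second, LEN ends up as "74" while IMG and GETNOTE stay ON.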
External configuration files are in effect system variables and are located in the ENVARC: directory (on disk) and in ENV: (generally in RAM). So, the contents of ENV: are valid only for the current session, while the contents of ENVARC: are also valid after a reset.
To permanently save a configuration file, copy it to both ENV: and ENVARC:.
Use your favorite text editor (Ed, Cygnus Editor, GoldEd, and so on) to create your prefs file; httx.prefs is the default filename. Save it in ENV:, and also save it to ENVARC: so it will not be lost when you reboot. Temporary changes can be made by editing just the ENV: file.
HTTX configuration may be fully managed using the plugin for the AWeb WWW browser.
When execution terminates, HTTX returns the appropriate AmigaDOS Return Code (RC), usable within scripts to determine if the conversion was successful. See your AmigaDOS handbook for a complete list of error codes.
In case of error, if QUIET is off, the appropriate AmigaDOS message will be displayed.
Following is a list of the most common errors. If the system is localized, messages are displayed in the appropriate language. See your AmigaDOS manual for further information.
HTTX can show other errors (in English only) caused by incorrect use of commands or options:
Finally, there are a few warnings which may be displayed. The conversion will take place but there may be situations altering the final result:
ENV:. Remember also to copy it again to ENVARC: when you edit it.
Q. "ANSI styles (bold, italic, underline, blue) stop after first line."
A. See the notes about ANSI conversion.
Q. "Converted text isn't centered, but in the original document it is."
A. This can happen if the text in a table row (<TR>) or cell (<TD>) is defined as centered. To maintain compatibility with some programs used with HTTX, this version doesn't yet support alignment defined in those elements. This will be added in future versions that will have more table support.
Q. "Sometimes alignment doesn't work, wordwrap and lists are not correctly formatted or HTML TAGS are shown."
A. This is text included between <PRE> TAGS. HTTX copies such text as is, without formatting. This choice was made because that kind of text often contains source code that the author probably wishes to keep as is.
In <LISTING> and <XMP> the TAGS are left as they are, as the specifications for those elements define. Although its use is deprecated in HTML 4.0, <XMP> is still widely used for examples in many documents, e.g. in the Netscape JavaScript specifications.
Q. "Some pages are not correctly converted..."
A. There could be many reasons: layout based on tables (not fully supported, see the "What is supported" section), errors in the HTML source (HTTX is quite tolerant, but there are limits) or errors in the HTTX engine. If you think the page is correct, send me an E-mail with the URL.
(E-Mail: favrin@tin.it, FidoNet: 2:333/726.8)
Q. "How can I directly use HTTX from AWeb or Directory Opus?"
A. AWeb users of version 3.1 or better can use the enclosed ARexx plugin.
HTTX can be used from Directory Opus by creating a button configured as follows (Directory Opus 4.12):
New Entry/AmigaDOS:
C:HTTX {f} {d}
(replace C: with path for HTTX)
With this configuration, a file selected from "source" directory will be converted to text and saved to the "destination" directory.
By activating the "Do all files" flag it's possible to convert more than one file, by selecting them and clicking the HTTX button.
Q. "How can I improve the performance of HTTX?"
A. To speed up the conversion, try using a filesystem with blocks of 1024 bytes, like the RAM disk. Note that if memory is almost full or fragmented, saving to the RAM disk may slow down the conversion process.
This section covers some aspects of HTML and their implementation in HTTX. Although reading this is not required to learn how to use HTTX, there is important information about conversion that you should read if you plan to distribute your converted texts.
Supported HTML:
What is (not yet fully) supported:
Implementation of the standard:
These rules are followed except in <SCRIPT>, <PRE>, <LISTING> and <XMP> elements. See "notes about conversion of <PRE>, <XMP>, <LISTING> and <SCRIPT> contents" for more information about this.
If the ANSI option is enabled, HTTX uses standard ANSI escape sequences for converting HTML styles (such as bold, italic, underlined and so on...), links (rendered as underlined blue), centering and indentation of text.
The ANSI codes used are taken from standard ANSI specifications and should be supported by any program (maybe...).
These are the sequences (ESC is replaced by "\e"):
\e[1m   bold
\e[3m   italic
\e[4m   underline
\e[33m  blue
Multiple ANSI definitions are combined using ";", i.e. to set bold and italic HTTX uses "\e[1;3m".
For list indentation and text alignment, HTTX uses the cursor position sequence "\e[nnC" where nn is the number of characters to move right. This sequence is not used when printing.
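As an illustration, the sequences described in this section can be built with a couple of Python helpers. This is a sketch only: the names `style` and `move_right` are hypothetical, and the count in the cursor sequence is assumed to be written as a plain decimal number without padding.

```python
ESC = "\x1b"  # ASCII 27, written as "\e" in this document
BOLD, ITALIC, UNDERLINE = 1, 3, 4


def style(*codes):
    """Build one style sequence; several codes are joined with ';'."""
    return ESC + "[" + ";".join(str(code) for code in codes) + "m"


def move_right(count):
    """Build the cursor sequence that moves 'count' characters right."""
    return f"{ESC}[{count}C"


# repr() shows the escape character instead of sending it to the console.
print(repr(style(BOLD, ITALIC)))
print(repr(move_right(12)))
```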
Compatibility problems:
The rules specified for implementation of the standard, wordwrap of text and 7-bit conversion of 8-bit characters aren't totally valid for some elements. The HTML specifications require them to be treated differently.
Remember this if the converted text will go on message areas, like Fidonet or Usenet newsgroups!
Beyond any communication, problem, bug report, advice or anything else, a comment about HTTX will be appreciated, as will news of any action you take in favour of organizations that take care of children (see "CHILDWARE").
E-Mail : favrin@tin.it
FidoNet: 2:333/726.8
Please write to me in Italian or English, thank you.
HTTX support page:
http://www.aspide.it/freeweb/poing/soft/httx/index.html
Beta testing of version 1.7:
HTTX 1.5 English documentation and spell check of 1.7:
Beta testing of AWeb Plugin:
Many thanks to Yvon Rozijn for wanting HTTX inside AWeb II and for all the help he gives me!
Thanks to W. v. Oortmerssen for the splendid AmigaE, used to write HTTX.
Finally, special greetings to those who wrote to me about HTTX and to those who use it!
V1.0 (July 1996)
V1.1 (November 1996)
V1.1a (January 1997)
V1.1b
V1.5 (May 1997)
V1.7 (January 1998)
HTTX is a program in continuous growth: because I use it daily, I notice shortcomings and possible improvements. This version will be the basis for a new, even better version that will be out... soon.